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Abstract. This article summarizes and critically reviews measurements of charged- 
particle multiplicity distributions and pseudorapidity densities in p + p(p) collisions 
between y/s = 23.6 GeV and y/s = 1.8 TeV. Related theoretical concepts are briefly 
introduced. Moments of multiplicity distributions are presented as a function of y/s. 
Feynman scaling, KNO scaling, as well as the description of multiplicity distributions 
with a single negative binomial distribution and with combinations of two or more 
negative binomial distributions are discussed. Moreover, similarities between the energy 
dependence of charged-particle multiplicities in p+p(p) and e + e~ collisions are studied. 
Finally, various predictions for pseudorapidity densities, average multiplicities in full 
phase space, and multiplicity distributions of charged particles in p + p(p) collisions at 
the LHC energies of yfs = 7 TeV, 10 TeV, and 14 TeV are summarized and compared. 
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1. Introduction 

The charged-particle multiplicity is one of the simplest observables in collisions of 
hadrons, yet it imposes important constraints on the mechanisms of particle production. 
Experiments have been performed with cosmic rays, fixed target setups, and particle 
colliders. These measurements have been used to improve, or reject, models of particle 
production which are often available as Monte Carlo event generators. Considering 
only the number of produced charged particles is a drastic reduction of the complex 
information contained in the final state of a particle collision, especially if the 
kinematic properties are neglected. Nevertheless, the multiplicity distribution, i.e., the 
probability distribution of obtaining a definite number of produced particles, still 
contains information about particle correlations. By definition, all information about 
correlations is removed when the data are reduced to the average charged particle 
multiplicity (N&). Distributions that still partly reflect kinematic properties are, e.g., 
the pseudorapidity (rj) density and the transverse momentum (p?) distribution which 
are one- dimensional projections of the kinematic properties. More sophisticated is the 
study of correlations of the final-state particles. Still, on a rather global level one 
typically studies the dependence of the average pt on the event multiplicity. More 
complicated correlation studies are, e.g., in the realm of Hanbury-Brown and Twiss 
(HBT) interferometry. 

This review focuses on the charged-particle multiplicity distribution and the 
pseudorapidity density. Correlations will only be partly covered in the discussion 
of moments of the multiplicity distributions. The main topics of this review cover 
basic theoretical concepts and their applicability to data, experimental challenges, 
experimental results, as well as predictions for the Large Hadron Collider (LHC) 
energies. Earlier reviews can be found in [TJ [2], [3J SI El [6J, [T] . The objective of this review 
is to give a general overview of the field, to discuss the relevant theoretical aspects, 
and to provide references for the reader who may want to study certain topics in more 
detail. From the description of collider experiments and their results the reader should 
obtain an understanding where the limitations of these theoretical descriptions lie and 
what the experimental trends as functions of centre-of-mass energy are. Furthermore, 
an objective is to discuss open experimental issues on which data from the LHC will 
provide the needed clarification. The study of the charged-particle multiplicity is an 
essential topic at the beginning of data-taking at the LHC. A precise characterization 
of the underlying event, i.e., of those particles of the event not related to the hard 
parton-parton scattering, is a precondition for most of the flagship research topics of 
the LHC. 

1.1. Brief Overview of Multiplicity Measurements 

The charged-particle multiplicity is a key observable for the understanding of multi- 
particle production in collisions of hadrons at high energy. The probability P(n) for 
producing n charged particles in the final state is related to the production mechanism of 



Charged- Particle Multiplicity in Proton-Proton Collisions 



4 



the particles. The multiplicity distribution follows a Poisson distribution if the final-state 
particles are produced independently. In this case the dispersion D = a/ (n 2 ) — (n) 2 is 
related to the average multiplicity as D = yjn)- Deviations from a Poisson distribution 
indicate correlations. 

Measurements of multiplicity distributions provide significant constraints for 
particle-production models. However, the discrimination between models typically 
requires more differential measurements. Particle production models are typically based 
on Quantum Chromodynamics; however, they necessarily contain a phenomenological 
component as the formation of particles involves a soft scale outside the realm of 
perturbative techniques. 

Early measurements of multiplicity distributions in e + e~ collisions at the centre-of- 
mass energy y/s = 29 GeV could approximately be described with a Poisson distribution 
[EJ [9]. Proton-proton collisions, on the other hand, exhibited broader multiplicity 
distributions. The energy dependence of the dispersion in non-single diffractive p + p 
collisions could approximately be described as D oc (n) up to the maximum ISR energy 
of i/i = 62 GeV [10] . A simple interpretation was that the correlations in p + p were 
related to an impact-parameter dependence of the charged-particle multiplicity [TT] . 

Interest in multiplicity distributions was stimulated by the paper of Koba, Nielsen, 
and Olesen in 1972, in which they derived theoretically that multiplicity distributions 
should follow a universal scaling at high energies (KNO scaling). KNO scaling was 
derived based on Feynman scaling, i.e., based on the assumption that the rapidity density 
diV c h/ dy at y = reaches a limiting value above a certain energy which corresponds to an 
asymptotic scaling of the total multiplicity as (n) oc In y/s. With z = n/(n) the function 
^(z) = (n)P(n) was expected to asymptotically reach a universal energy-independent 
form. Bubble chamber data between i/i w 6 GeV and 24 GeV indicated an onset of KNO 
scaling already at a/s w 10 GeV [12]. At the ISR among other observations the relation 
D oc (n) indicated that KNO scaling was satisfied [10J (although deviations were noted in 
[13J). However, it was found that the average multiplicity increased faster with energy 
than In yfs. Thus, the theoretical basis for KNO scaling was found to be empirically 
false. In 1985, breaking of KNO scaling was observed by the UA5 collaboration in p + p 
collisions at y'i = 540 GeV [14] . In a later publication UA5 concluded that KNO scaling 
was already violated at y/s = 200 GeV [15] . 

UA5 found that multiplicity distributions up to y/s = 540 GeV can be well described 
by a negative binomial distribution (NBD) [T4] which is defined by two parameters 
(n) and k. The parameter k determines the width. It was found that 1/k increases 
approximately linearly with In ^/i whereas KNO scaling corresponds to a constant, 
energy-independent 1/k. However, deviations from the NBD were discovered by UA5 at 
yfs = 900 GeV and later confirmed at the Tevatron at yfs = 1800 GeV [IB] . A shoulder 
structure appeared at n w 2(n) which could not be described with a single NBD. This 
led to a two-component model by Giovannini and Ugoccioni in 1999 who described 
the measured data by a combination of two NBDs, interpreting one as a soft and one 
as a semi-hard component. An alternative description interpreted the results in terms 
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of multiple-parton interactions which become more important at higher energies. The 
superposition of several interactions affects the multiplicity distribution and therefore 
potentially explains the deviation from the scaling found at lower energies. 

Multiplicity measurements in e + e~ between 10 < y/s < 91.2 GeV showed that also 
in this system the dispersion D scaled approximately linearly with (n) and that KNO 
scaling was approximately satisfied jTTj. Thus, the Poisson shape at yfs ~ 30 GeV was 
merely accidental. Also in e + e~ the NBD provided a useful description of the data. 
However, the Delphi experiment found that at \fs = 91.2 GeV multiplicity distributions 
in restricted rapidity intervals exhibited a shoulder structure similar to the observations 
in p + p(p) [18J. This shoulder was attributed to three- and four-jet events which have 
larger average multiplicities than two-jet events. 

1.2. Structure of the Review 

This review is divided into two parts. The first introduces basic theoretical concepts, 
the second concentrates on experimental data. 

The first part discusses scaling properties of the multiplicity, i.e., Feynman and 
KNO scaling. We recall the definitions of various moments used in this review. 
Furthermore, negative binomial distributions (NBDs) and two-component models are 
discussed. Similarities between p+p{p) and e + e~ collisions are investigated in the context 
of QCD. 

The second part starts with the introduction of important aspects of the multiplicity 
analysis. It is demonstrated that measured multiplicity distributions need to be unfolded 
to obtain the original (true) distribution. Subsequently, measurements in the centre- 
of-mass energy range from yfs = 23.6 GeV to 1.8 TeV are presented. The applied 
analysis methods and error treatments are discussed. Selected pseudorapidity density 
and multiplicity distributions are shown, and their agreement with the theoretical 
descriptions introduced in the first part is assessed. The dependence of the multiplicity 
on the collision energy is analyzed. Results from hadron and lepton colliders are 
compared and their universalities and differences investigated. The behaviour of the 
moments of the multiplicity distribution as a function of y/s are studied. Then 
we investigate how single NBDs and the combination of two NBDs describe the 
distributions. This part concludes with a discussion of open experimental issues. 

An overview of available predictions for the LHC energy range is given in the final 
section of the review. 

2. Theoretical Concepts Related to the Charged-Particle Multiplicity 

Before introducing various analytical descriptions of the multiplicity distribution, it 
is important to remark that in the case that the underlying production process can 
be described by uncorrelated emission the multiplicity distribution is expected to be 
of Poisson form. Any deviation from this indicates correlations between the produced 
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particles. Forward-backward correlations have in fact been measured, e.g., by UA5 in 
p + p collisions [19] but are not further discussed here. 

Many authors have tried to identify simple analytical forms that reproduce the 
multiplicity distributions at different y/s requiring only a simple rescaling or changing 
only a few parameters as function of y/s. At LHC energies a regime is reached where the 
average collision contains multiple parton interactions whose products might undergo 
final-state interactions. Considering the wealth of processes that are expected at large y/s 
it is not obvious whether approaches based on analytical forms are capable of capturing 
the underlying physics. 



2.1. Feynman Scaling 

Feynman concluded that for asymptotically large energies the mean total number of any 
kind of particle rises logarithmically with y/s [20J: 

(N) oc In W oc In with W = y/s/2. (1) 

His conclusions are based on phenomenological arguments about the exchange of 
quantum numbers between the colliding particles. He argued that the number of particles 
with a given mass and transverse momentum per longitudinal momentum interval p z 
depends on the energy E = E(p z ) as 

— ~i. (2) 

dp, E y > 

This was extended to the probability of finding a particle of kind % with mass m and 
transverse and longitudinal momentum p? and p z : 

MPT,x F =p z /W)^d 2 p T (3) 
with the energy of the particle 



E = \Jm 2 +pip +p 2 z . (4) 

The function fi(pT,%F) denotes the particle distribution. Feynman's hypothesis is 
that fi becomes independent of W at high energies. This assumption is known as 
Feynman scaling and fi is called the scaling function or Feynman function. The variable 
Xp = p z /W, called Feynman-x, is the ratio of the longitudinal momentum of the particle 
p z to the total energy of an incident particle W. Integration of expression (J3j) results in 
(N) oc lnW. A derivation is given in Appendix lAl 

Considering that the maximum rapidity in a collisions increases also with In y/s, it 
follows that: 

— — = constant, (5) 
dy 

i.e., the height of the rapidity distribution around mid-rapidity, the so-called plateau, 
is independent of y/s. Equivalently, the pseudorapidity at mid-rapidity dN/dr]\ v=0 is 
approximately constant if Feynman scaling holds (the pseudorapidity is defined as 
r] = (1/2) ln[(p +Pl)/(p — Pl)] — — lntan?9/2 where p [p£) is the total (longitudinal) 
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momentum of the particle and d the angle between the particle and the beam axis). 
Here the transformation from y to 77 has to be taken into account. It depends on the 
average tut = \fm 2 + pj, of the considered particles which, however, is only weakly 
energy-dependent. An estimate based on the Pythia event generator shows that the 
ratio (dN ch /dy)/(dN ch /dr]) changes by only 1 - 2% from sfs = 100 GeV to 1 TeV. 
Furthermore, this transformation causes a dip in the distribution around rj w which is 
not present in the rapidity distribution itself (see Section 13^1 where measured dN ch /drj 
distributions are shown). 



2.2. Moments 

To describe properties of multiplicity distributions, e.g., as a function of s/s, it is 
convenient to study their moments. All moments together contain the information of 
the full distribution. In practice only the first few moments can be calculated with 
reasonable uncertainties due to limited statistics. The reduced C-moments are defined 
by 

q <»>' (Z n nP n y W 
where q is a positive integer and P n the probability for producing n particles. The 
normalized factorial F-moments are defined by: 

(ra(rc - 1) . . . (w - g + 1)) 

q = W q ' {) 

It can be shown that these moments contain in integrated form the correlations of the 
system of particles (see e.g. [2TJ H])- F° r a Poisson distribution one obtains F q = 1 for 
all values of q. 

The D-moments are defined by: 

D q =((n- (n)Y)^. (8) 

D = D 2 is referred to as dispersion. The equations used to calculate the uncertainties 
of these moments are given in Appendix [Bl 

The normalized factorial cumulants K q can be calculated recursively from the 
normalized factorial moments according to J2TJ HJ [22] 

K q = F q - £ ( Q 7 1 )K q _ lFi . (9) 



i=l 



The cumulants of rank q represent genuine g-particle correlations not reducible to the 
product of lower-order correlations. The if-moments are defined by 

H q = ^- (10) 

The if-moments are interesting because higher-order QCD calculations predict that 
these oscillate as a function of the rank q [23]. K- and if -moments are only briefly 
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discussed in the following. For more details about the definitions of the moments see, 
e.g., 0H2U- 

The analysis of moments helps unveil patterns and correlations in the multi- 
particle final state of high-energy collisions in the presence of statistical fluctuations 
due to the limited number of produced particles. One pattern extensively searched for 
experimentally was self-similar or fractal structures in multi-particle spectra (23]. The 
observation of fractal structures is of great interest because it imposes strong constraints 
on the underlying particle-production mechanism. A natural candidate for explaining 
self-similarity is the parton cascade [25]. Bialas and Peschanski introduced the concept 
of intermittency to search for self-similarity in multiplicity distributions [261 EZ] • They 
proposed to study the normalized factorial moments F q in decreasing pseudorapidity 
intervals Srj. Self-similarity in the particle production process would then manifest itself 
as a power-law behaviour of the F q {8rj) as a function of the bin size 5r]: 

F q {5rf) oc (Sri)-**. (11) 

For more information we refer the reader to [211 EH EH E] . 

2.3. Koba-Nielsen-Olesen (KNO) Scaling 

KNO scaling was suggested in 1972 by Koba, Nielsen, and Olesen [30]. Their main 
assumption is Feynman scaling. 

KNO scaling is derived by calculating 

(n(n — 1) . . . (n — q + 1)} = 

J f {q \xi,PT^---;x g ,PT, g )^^dp^ 1 ---^^dp^ q (12) 

which is an extension of the expression used in the derivation of Feynman scaling (see 
Eq. (1A.4jl ) that uses a function f( q ' that describes g-particle correlations (q particles with 
energy E q , longitudinal momentum p Zjq , transverse momentum pr,q, and Feynman- a; x q ). 
Integration by parts is performed for all Xj and it is proven that the resulting function 
is uniquely defined by moments. This yields a polynomial in Ins. With a substitution 
of the form (n) oc In s the multiplicity distribution P(n) is found to scale as 

P (n) = tU( A) + O (-^V (13) 



(n) y (n) y ' V( n ) 2 / 
where the first term results from the leading term in In s, that is (In s) q . The second term 
contains all other terms in Ins, i.e., (Ins) 9 ' for q' < q. \?(z := n/(n)) is a universal, i.e., 
energy-independent function. This means that multiplicity distributions at all energies 
fall on one curve when plotted as a function of z. However, ^f(z) can be different 
depending on the type of reaction and the type of measured particles. 
The C-moments, 



C q = / z q *(z)dz, (14) 
define *&(z) uniquely [30] ■ Substituting z = n/(n) results in Eq. (jBJ). 
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When the scaling hypothesis holds the moments are independent of energy. 
Experimentally one can determine D 2 = (n 2 ) — (n) 2 ; the relation D/(n) = const. 
follows from Eq. ( Fl3|) (if *$f(z) is not a 5 function, see [30J). 

It has been pointed out [31] that the conclusion that the multiplicity distribution 
follows a universal function is only an approximation (neglecting the second term in 
Eq. f TTB"]) ). Therefore the exact result is that the factorial moments (see Eq. ©) are 
required to be constant, not the reduced moments (which follow from Eq. f fT4|) ). This is 
addressed further in Section 13.61 

The description of discrete data points with a continuous function in Eq. ( fl3|) is an 
approximation valid for (n) ^> 1. A generalized KNO scaling which avoids this problem 
is described in [321 [33]. Moreover, different scaling laws for multiplicity distributions 
were proposed (see e.g. (HE]), among those the so-called log-KNO scaling [34] which 
predicts a scaling of the form 

1 /Inn + c(s)\ 

P{n) = ry-^ > (15) 



A(«rv K*) 

where tp is a universal, energy-independent function. The energy-dependent functions 
A(s) and c(s) correspond to (n) and the multiplicity related to the leading particles, 
respectively. 

2.4- Negative Binomial Distributions 

The Negative Binomial Distribution (NBD) is defined as 

pT^=^ n+k n ~ i y^-v) n v k - as) 

It gives the probability for n failures and k — 1 successes in any order before the fc'th 
success in a Bernoulli experiment with a success probability p. The NBD is a Poisson 
distribution for k~ x — > and a geometrical distribution for k — 1. For negative integer 
k and (n) < —k the distribution is a binomial distribution where —k is the number 
of trials and —(n)/k the success probability (see Appendix ICl). The continuation to 
negative integer k is performed by writing the binomial in terms of the T function and 
using the equation T(x + 1) = xT(x): 

n + Jfe-l)! T(n + k) 

17) 




n\{k-l)\ r(n+l)r(fc) 
n + k - 2) • . . . • k 



r(n + 1) 

The mean of the distribution (n) is related to p by p^ 1 = 1 + (n)/k. This leads to the 
form of the NBD that is commonly used to describe multiplicity distributions [35, 36J: 

n + k-l \ ( (n)/k Y 1 



>NBD, 



{n) ' kK ! I n \l + (n)/k) (l + (n)/k) k 
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Figure 1. Examples of negative binomial distributions. 



The dispersion D and the second-order normalized factorial moment F2 of the NBD are 
given by 

d =\M i+ ^) ; F2=i+1 *- (19) 

Moreover, k is related to the integral of the two-particle pseudorapidity correlation 
function 6*2(^1, ^2) as shown in [37[ 138] . 

Figure [1] shows normalized NBDs for different sets of parameters. NBDs for values 
of k that lead to characteristic shapes are also shown: the case of a large k where the 
distribution approaches a Poisson distribution is shown, the case with a negative integer 
k where the function becomes binomial, and the case of k being positive and smaller 
than unity. ^(n) follows KNO scaling if k is constant (energy- independent). This 
can be seen from the KNO form 

^nbd(^) = (2°) 

which holds in the limit (n)/k ^> 1 [3]. Therefore, studying k as a function of i/i for 
multiplicity distributions described by NBDs indicates whether KNO scaling is fulfilled. 
NBDs have been shown to provide a useful parameterization of multiplicity distributions 
in p + p{p) collisions as well as in various other systems including e + e~ [9j [39], \i + p 
PU] and central nucleus-nucleus collisions [^01 128]. However, the NBD has been shown to 
underestimate particle correlations found in e + e~ data, which can be shown by studying 
factorial moments and cumulants (SJ W2\ . 

The physical origin of a multiplicity distribution following a negative binomial form 
has not been ultimately understood. However, one approach is to use the recurrence 
relation of collisions of multiplicities n and n + 1 [37J. This relation is defined such that 
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for uncorrelated emission it is constant; any departure shows the presence of correlations. 
Evaluating 



for a Poisson distribution P(n) = \ n e~ x /n\ (representing uncorrelated emission) yields 
the result that g(n) = A is indeed constant. The term n+1 in Eq. (I2T]) can be understood 
by considering that the particles are in principle distinguishable, e.g., by their momenta; 
therefore it has to be taken into account that a collision of multiplicity n + 1 can be 
related to n + 1 collisions of multiplicity n (by removing any single one of the n + 1 
particles). 

For NBDs, Eq. (I2TT) can be written as 



A model of partially stimulated emission identifies a in Eq. f l22l) with the production 
of particles independent of the already present particles and bn with emission that 
is enhanced by already present particles (Bose-Einstein interference). From g(n) = 
a(l + n/k) (which follows from Eq. fl22]) ) one sees that k~ l is the fraction of the already 
present particles n stimulating emission of additional particles. Following these rather 
simple assumptions results in two conclusions that are confirmed experimentally: 1) k 
increases when the considered 77-interval is enlarged (because the range of Bose-Einstein 
interference is finite, the fraction of present particles stimulating further emission 
reduces); 2) k decreases with increasing -y/i for a fixed ^-interval (the density of particles 
in the same interval increases because (n) increases) [37J. 

The multiplicity distribution can be deduced as being of negative binomial shape 
within the so-called clan model [37J H3j [5]. It describes the underlying production by a 
cascading mechanism. In the clan model a particle can emit additional particles, e.g., 
by decay and fragmentation. A clan (or cluster) contains all particles that stem from 
the same ancestor. The ancestors themselves are produced independently. 

The production of ancestors, and thus clans, is governed by a Poisson distribution 
P(N, (N)) where (N) is the average number of produced clans. The probability 
of producing n c particles in one clan F c (n c ) can be determined from the following 
considerations: One stipulates that without particles there is no clan: 



and assumes that the production of an additional particle in a clan is proportional to 
the number of already existing particles with some probability p (see also Eq. (l2Tj) ): 




(21) 



g(n) = a + bn with k = a/b and (n) = a/(l 
or a = (n)k/((n) + k) and b ■ 



b) 

(n)/((n) + k). 



(22) 



F c (0) = 



(23) 




(24) 



By iteration, the following expression is obtained: 




~7i c — 1 



(25) 
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The multiplicity distribution that takes into account the distribution of clans and the 
distribution of particles among the different clans is: 

n 

P(n) = P (N, (N))J2*F c (n 1 )F c (n 2 )...F c (n N ), (26) 

N=l 

where Yl* runs over all combinations rij for which n = YliLi n i * s valid. It can be shown 
that Eq. ([26]) is a NBD where (n) = (N) F c (l) / (1 - p) and k = (N)F c (l)/p [37]. The 
average number of clans (N) and the average multiplicity (n c ) within a clan, in turn, 
are related to the NBD parameters (n) and k via [37] 

For the case of n = 2 it can be shown using Eqs. ( 1241) and ( 1261) that k is the relative 
probability of obtaining one clan with two particles with respect to obtaining two clans 
with one particle each. 



2.5. Two- Component Approaches 

2.5.1. Combination of two NBDs Multiplicity distributions measured by UA5 have 
been successfully fitted with a combination of two NBD-shaped components [2]. A 
systematic investigation has been performed by Giovannini and Ugoccioni who interpret 
the two components as a soft and a semi-hard one [45j. These can be understood as 
events with and without minijets, respectively (the authors of [15] use a definition from 
the UAl collaboration: a minijet is a group of particles having a total transverse energy 
larger than 5 GeV): the fraction of semi-hard events found corresponds to the fraction of 
events with minijets seen by UAl. It is important to note that this approach combines 
two classes of events, not two different particle-production mechanisms in the same 
event. Therefore, no interference terms have to be considered and the final distribution 
is the sum of the two independent distributions. 

In this approach, the multiplicity distribution depends on five parameters, that may 
all be y/s dependent: 

P(n) = owt x P£f D k (n) + (1 - a sof t) x Pflf D k (n). (28) 

\ / hon WsoftiKsoftV ' v hoIt/ W semi-hard :«semi-hardV ' v > 

The parameters and their dependence on \fs are found by fitting experimental data. 
Note that (n) is about two times larger in the semi-hard component than in the soft 
component. Furthermore, the fits show that the soft component follows KNO scaling, 
whereas the semi-hard component violates KNO scaling. This is discussed in more detail 
in Section 13.81 

A modified formulation of this approach includes a third component representing 
events initiated by hard parton scattering. This class is also of NBD form with the 
parameter k being smaller than 1 resulting in a substantially different shape (see 
Figure [T|). Furthermore, the parameter (n) is much larger than for the other two 
components. For more details see [46J. 
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2.5.2. Interpretation in the Framework of Multiple-Parton Interactions Above ISR 
energies parton-parton interactions with high momentum transfer (i.e. hard scatterings) 
are expected to contribute significantly to the total charged-particle multiplicity in 
p + p{p) collisions [17J HHJ H9]. Hard parton-parton scatterings resulting in QCD jets 
above a transverse momentum threshold can be described by perturbative QCD. Softer 
interactions either require a recipe for the regularisation of the diverging QCD jet cross 
section for p? — > jH] or models for soft-particle production, see e.g. [49J. The transverse 
momentum scale Pt,o which controls the transition from soft to hard interactions is 
typically around 2 GeV/c. In these QCD-inspired models two or more independent hard 
parton-parton scatterings frequently occur within the same p + pip) collision [48, 50J. 
These models explain many observed features of these collisions including the increase 
of the total inelastic p + pip) cross section with a/s, the increase of (px) with the 
charged particle multiplicity iV c h, the increase of (pr) with y/s, and the increase of 
dN c h/dr) with ^/s. High-multiplicity collisions in these models are collisions with a large 
number of minijets. In the minijet model of ref. [IH] up to eight independent parton- 
parton scatterings are expected to significantly contribute to the high-multiplicity tail 
of the multiplicity distribution at y/s = 1-8 TeV. The violation of KNO scaling within 
this model is also attributed to the onset of minijet production. Strong correlations 
between multiple parton interactions and the shape of the multiplicity distribution are 
also present in the Pythia event generator [51]. In Pythia the multiplicity distribution 
turns out to be strongly related to the density profile of the proton (HI [50]. A purely 
analytic model based on multiple parton interactions is the IPPI model |52j . In this 
model the multiplicity distribution is a superposition of negative binomial distributions 
where each NBD represents the contribution of collisions with a given number of parton- 
parton scatterings. 

A data-driven approach to define and identify double parton interactions and thus a 
second component in the multiplicity distribution is given in [53J where the multiplicity 
distribution is plotted in a KNO-like form and the part of the distribution for which KNO 
scaling holds is subtracted. This is done by comparing the distribution to a KNO fit that 
is valid at ISR energies. The KNO-like variable z' = n/ (%), with (ni) being the average 
multiplicity of the part of the distribution that follows KNO scaling, is used. Due to 
the large errors in the low-multiplicity bins of the specific data set at 1.8 TeV analyzed 
in [53], (ni) cannot be satisfactorily determined. Therefore, it is found by using the 
empirical relation (ni) w 1.25 n max where n max is the most probable multiplicity which 
is inferred from the KNO fit at ISR energies. The authors find an interesting feature 
when the part that follows the KNO fit is subtracted and the remaining part is plotted 
(not shown here). The remaining part does not follow KNO scaling, its most probable 
value z' mgx is 2, and its width is about s/2 times the width of the KNO distribution. 

This procedure to identify the second component is similar to the one described 
in the previous section. However, the authors of [53J conclude that the second part of 
the distribution is the result of two independent parton-parton interactions within the 
same collision. The cross sections of the two contributions (eri,er 2 ) can be calculated as 
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a function of yfs. It is found that <j\ is almost independent of y/s, whereas increases 
with sfs. However, it remains unclear if two parton-parton interactions in the same 
collision evolve independently to their final multiplicity due to final-state interactions. 

The same reasoning and data are used in [51] to identify a third component, three 
independent parton-parton interactions. This is extended in [55] to predict a multiplicity 
distribution for the LHC design energy of 14 TeV which is shown in Section [H 

2.6. Similarities between p + p(p) ande + e~ Collisions and QCD Predictions 

The theoretical description of the formation of hadrons necessarily involves a soft 
scale so that perturbative QCD cannot be directly applied. Therefore, models for soft 
interactions, like those from the large class of string models, are often used to describe 
multiplicity distributions in collisions of hadrons (see for example the Dual Parton Model 
[56J or the Quark-Gluon String Model [57]). It is instructive to compare multi-particle 
production in p + p(p) and e + e~ collisions. In p + p(p) collisions without a hard parton- 
parton interaction, as well as in e + e~ collisions, particle production can be viewed 
as resulting from the fragmentation of colour-connected partons. In e + e~ collisions the 
colour field extends along the jet axis whereas in p+p(p) it stretches along the beam axis. 
Based on this analogy it is not unreasonable to expect some similarities between particle 
production in p + p(p) and e + e~ collisions. However, the configurations of the strings in 
the two cases are different. In addition, processes with different energy dependences will 
contribute significantly to the overall particle multiplicity at high energies [6]: minijet 
production in hard parton-parton scattering in the case of p + p(p) collisions and hard 
gluon radiation in e + e~ collisions. Thus, theory does not provide convincing arguments 
for this simple analogy, yet striking similarities were indeed observed (see Section 133]) . 

In e + e~ collisions moments of the multiplicity distributions are rather well 
described in an analytical form with perturbative QCD in the modified leading 
logarithmic approximation (MLLA) [58]. The evolution of the parton shower is described 
perturbatively to rather low virtuality scales close to the hadron mass. Using the 
hypothesis of local parton-hadron duality (LPHD) one then assumes a direct relation 
between parton and hadron multiplicities. In e + e _ collisions the next-to-leading-order 
(NLO) prediction for the average multiplicities is given by 



with a = \/6tt 12/23 and b = 407/838 for 5 quark flavors [59J [60j [61]. Fixing the strong 
coupling constant a s at the Z mass to a s (M|) = 0.118 leaves an d ^4lphd as free 
parameters. An excellent parameterization of the experimental multiplicities can be 
obtained in this way (see Section \3.5h . An analytical form at next-to-next-to- next-to- 
leading order (3NLO) is available in [4"| 162]. 

The second factorial moment for multiplicity distributions in e + e~ collisions 




(29) 
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is given at NLO by [63] 



F 2 (y^) = y(l-0.55^a s (v/i)) 



(31) 



This QCD prediction for F 2 is about 10% above the experimental values for y/s = 
10 — 91.2 GeV [SI]. The calculation of higher moments shows that the theoretical 
multiplicity distributions in e + e~ collisions are well approximated by negative binomial 
distributions 163] with 



This implies that asymptotically (a s — > as s — > oo) the multiplicity distributions 
in e + e~ collisions satisfy KNO scaling. However, the KNO form of the multiplicity 
distributions up to the maximum LEP energy (corresponding to a s > 0.1) differs 
significantly from the asymptotic form. 

Even before QCD was known, Polyakov found that KNO scaling occurs naturally in 
a picture of hadron production in a self-similar scale-invariant branching process [65, 66J. 
For e + e~ collisions the KNO form was given as 



where a(z) is a monomial. Thus, ip(z) is a gamma distribution in z^. In the double 
logarithmic approximation (DLA) of QCD, valid at asymptotic energies, the KNO form 
of the multiplicity distribution in jets can be calculated [67|[68]. Higher-order corrections 
to this form were found to be large [69] so that the preasymptotic distributions, e.g., at 
LEP energies, are quite different from the asymptotic DLA form [70] . 

3. Charged-Particle Multiplicity Measurements 

This part of the review presents p + p(p) measurements that have been performed by 
experiments at hadron colliders, i.e., the ISR, SppS, and Tevatron. The Intersecting 
Storage Rings (ISR), the very first hadron collider, was operating at CERN between 
1971 and 1984. It collided p on p, p, and a-particles at a maximum centre-of-mass 
energy of 63 GeV. The Super Proton Synchrotron (SPS) which has operated at CERN 
since 1976 has accelerated in its lifetime electrons, positrons, protons, anti-protons, and 
ions. After modifications it operated as a collider and provided p on p collisions with a 
maximum \fs of 900 GeV, at that time it was called SppS. The Tevatron at the Fermi 
National Accelerator Laboratory (FNAL) came into operation in 1983. It provides p + p 
collisions at energies up to \fs = 1.96 TeV. In addition, results from bubble chamber 
experiments are included where appropriate. 

References of experimental measurements at these colliders are given, discussing 
their analysis methods and error treatments. A selection of measurements of experiments 
at these colliders is shown to assess the validity of the models that have been 
described above. Additionally, the experimental challenges are recalled and unresolved 
experimental inconsistencies are discussed. 





(32) 



if)(z) oc a(z) exp(— z^) with \i > 1 , 



(33) 
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a) Non-diffractive b) Single-diffractive c) Double-diffractive 




Figure 2. Rapidity distributions of charged particles per event for different processes, 
non-diffractive (left panel), single-diffractive (centre panel), and double-diffractive 
(right panel) . These have been obtained with Pythia at ,/s — 900 GeV. 



3.1. Analysis Techniques 

3.1.1. Event Classes Inelastic p+p collisions are commonly divided into non-diffractive 
(ND), single-diffractive (SD), and double-diffractive (DD) events. Figure [2] shows 
rapidity distributions of those classes obtained with Pythia to illustrate their differences. 
Non-diffractive collisions (left panel) have many particles in the central region, with their 
yield steeply falling towards higher rapidities. In a single-diffractive collision only one 
of the beam particles breaks up and produces particles at high rapidities on one side. 
In the centre panel only those single-diffractive collisions are shown where the particle 
going to positive y breaks up. The other incoming particle, still intact and with only 
slightly altered momentum, is found near the rapidity of the beam. In a double-diffractive 
collision (right panel) both beam particles break up and produce particles. A dip can 
be seen in the central region. The different scales of the three distributions should be 
noted. Integrating the histograms demonstrates that the average total multiplicity is 
about a factor of four higher in non-diffractive collisions than in diffractive collisions. 

Measurements are usually presented for the sample of all inelastic collisions or non- 
single- diffractive (NSD) collisions, i.e., not considering the SD component. The reason 
for the latter choice is that trigger detectors are usually less sensitive to SD events due to 
their topology: few particles are found in the central region and only the incident proton 
is found on one side. To select a pure NSD sample for the analysis, depending on the 
detector geometry, SD events that pass the trigger can be rejected by their reconstructed 
topology, e.g., to reject events where in one hemisphere no track or only one track is 
found that has 80% of the incident proton momentum. Collider detectors operating 
today have limited phase space acceptance at higher rapidities. Therefore they allow 
only a limited event-by-event decision of the occurred process and rely on Monte Carlo 
simulations for the subtraction of SD events. Naturally a larger systematic uncertainty 
is associated with this correction method. 
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Figure 3. The need for unfolding. The left panel shows a measured spectrum in a 
limited region of phase space superimposed with the true distribution that caused 
the entries in one single measured bin (exemplarily at multiplicity 30 indicated by 
the line). Clearly the shape of this true distribution depends on the shape of the 
multiplicity distribution given by the model used (a suggestive example is if the true 
spectrum stopped at a multiplicity of 40: the true distribution that contributed to the 
measured multiplicity of 30 would clearly be different, still events at a multiplicity 
of 30 would be measured). Inversely, in the right panel, the true distribution is 
shown superimposed with the measured distribution caused by events with the true 
multiplicity 30 (exemplarily). The shape of this measured distribution still depends on 
the detector simulation, i.e., the transport code and reconstruction, but not on the 
multiplicity distribution given by the model (only events with multiplicity 30 contribute 
to the shown measured distribution). 



3.1.2. Unfolding of Multiplicity Distributions Given a vector T representing the true 
spectrum, the measured spectrum M can be calculated using the detector response 
matrix R: 

M = RT. (34) 

The aim of the analysis is to infer T from M. Simple weighting, i.e., assuming that a 
measured multiplicity m is caused 'mostly' by a true multiplicity t, would not be correct. 
This is illustrated in Figure [31 Analogously, adding for each measured multiplicity 
the corresponding row of the detector response matrix to the true distribution is also 
incorrect. This is model-dependent and thus may produce an incorrect result. On the 
other hand the measured spectrum which is the result of a given true multiplicity is only 
determined by the detector simulation and is independent of the assumed spectrum. 
Given a measured spectrum, the true spectrum is formally calculated as follows: 

T = R^M. (35) 
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R^ 1 cannot be calculated in all cases, because R may be singular; e.g. when a poor 
detector resolution causes two rows of the matrix to be identical. This can in most cases 
be solved by choosing a more appropriate binning (combining the entries in question). 
Even if R can be inverted, the result obtained by Eq. (|35|) contains usually severe 
oscillations (due to statistical fluctuation caused by the limited number of measured 
events and events used to create the response matrix). The effect of the limited number 
of measured events can be illustrated with the following example [TlJ : a square response 
matrix is assumed to describe the detector (rows: measured multiplicities; columns: true 
multiplicities): 

/ 0.75 0.25 ••• \ 

0.25 0.50 0.25 
R= 0.25 0.50 0.25 . (36) 

0.25 0.50 

V i '"J 

A true distribution T is assumed, and the expected measured distribution M is 
calculated using Eq. (l3lj) . The distribution M is used to generate a sample of 10 000 
measurements: M. Using Eq. (1351) the corresponding true distribution T is calculated. 
Figure H] shows these four distributions. Although the resolution effect on the shape 
of the measured distribution (left histogram) is very small, the unfolded solution (right 
histogram) suffers from large non-physical fluctuations. Clearly, this is not the spectrum 
that corresponds to the true one. 

The information that is lost due to the resolution cannot be recovered. To work 
around the consequence of non-physical fluctuations the result is usually constrained 
with a priori knowledge about the smoothness of the true distribution. Methods that 
allow the recovery of the true distribution (with the limitation that structures in 
the distribution that are smaller than the resolution will not become visible) are y 2 
minimization with regularization [71] and Bayesian unfolding [721 [73] ■ X 2 minimization 
allows to find the true spectrum by minimizing a x 2 function with a regularization term. 
These regularization schemas reduce fluctuations, e.g., by preferring solutions with small 
sums of the first or second derivatives, or by maximizing the entropy. Their influence 
has to be carefully studied to keep the bias on the unfolded solution small [H] . Bayesian 
unfolding is an iterative method based on Bayes' theorem which implictly regularizes 
the solution by limiting the number of iterations [75J. Both methods are described and 
evaluated in detail in [76] . 

Clearly, for the true distribution in full phase space only even numbers of particles 
occur due to charge conservation. Due to efficiency and acceptance effects in the 
measured spectrum also odd numbers occur. This has to be taken into account in 
the detector response matrix, i.e., every column corresponding to an odd number of 
generated particles is empty; for the number of measured particles even and odd values 
occur. For the correction to limited phase space this constraint does not arise. 
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Figure 4. Illustration of the problem with simple matrix inversion. The left panel 
shows a sample of the measured distribution M with 10 000 entries (histogram). Using 
Eq. (|35p the corresponding true distribution T is calculated, which is shown in the 
right panel (histogram). The overlaid function is the true shape T. Although the 
resolution effect on the shape of the measured distribution (left) is very small, the 
solution obtained by matrix inversion suffers from large fluctuations. The regularity is 
explained by the fact that the response matrix only contains entries on the diagonal 
and directly next to it. 



3.2. Data Sample 

The data sample considered in this review is summarized in Table [TJ Details about the 
different analyses are given ordered by detector and collider. Unless otherwise stated, 
the correction procedures described in the publications consider the effect of decays of 
strange and neutral particles as well as the production of secondary particles due to 
interactions of primary particles with the material. 

Multiwire proportional chambers inside the Split Field Magnet detector 
(SFM) [87J at the ISR measured the multiplicity distribution for NSD and inelastic p+p 
events at y/s = 30.4, 44.5, 52.6, and 62.2 GeV [TO]. Between 26 000 and 60 000 events 
were collected for each of the energies. The SD component was removed from the sample 
by means of its topology: events are considered SD if in one of the hemispheres no track 
or only one track carrying 80% of the incident proton's energy is found. Systematic 
errors have been evaluated and include the error that arises from the corrections and in 
the low-multiplicity region from the subtraction of elastic events. 

A detector based on streamer chambers [13] at the ISR measured pseudo- 
rapidity and multiplicity distributions for inelastic events at centre-of-mass energies of 
23.6, 30.8, 45.2, 53.2, and 62.8 GeV. Between 2 300 and 5 900 events were measured for 
each energy. In the analysis corrections for the acceptance, the low-momentum cut-off 
(about 45MeV/c), and secondary particles are taken into account. 
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Table 1. Listed are the references of data used in this chapter. For each reference it 
is specified at which energy the sample was taken, to which event class it is corrected, 
and what kind of data (dN^/drj and/or multiplicity distribution) are presented. 



Experiment 


Ref. 


Energy 


dN ch /dr] 


Mult. 


Remark 


SFM 


m 


30.4, 44.5, 52.6, 62.2 GeV 




X 


a 






(INEL, NSD) 








Streamer 


m 


23.6, 30.8, 45.2, 53.2, 62.8 GeV 


X 


X 




Chambers 




(INEL) 








Detector 












UA1 




[77] 




200, 500, 900 GeV (NSD) 




X 






[78j 


540 GeV (NSD) 


X 






UA5 


m 


53 GeV (INEL) 


X 


X 


b 




m 


53, 200, 546, 900 GeV (INEL, NSD) 


X 








m 


546 GeV (INEL, NSD) 


X 


X 


c 




m 


540 GeV (NSD) 




X 






[35 




540 GeV (NSD) 




X 


d 




[36 




200, 900 GeV (NSD) 




X 


e 




m 


200, 900 GeV (NSD) 




X 




P238 


m 


630 GeV (NSD) 


X 






CDF 


m 


0.63, 1.8 TeV (NSD) 


X 








m\ 


1.8 TeV (NSD) 




X 


f 


E735 




PI 




0.3, 0.5, 1.0, 1.8 TeV (NSD) 




X 


g 




[85J 


0.3, 0.5, 1.0, 1.8 TeV (NSD) 




X 


h 




m 


1.8 TeV (NSD) 




X 





a Error of cross section included in multiplicity distribution. 
b Comparison p + p vs. p + p\ only uncorrected multiplicity. 
c Comprehensive report. 

d Multiplicity distribution forced to be of NBD shape. 

e This data is not used here because the method has been partially revised in [T3] . 
f No systematic error assessment. 

g Method description very limited; extrapolated from \v\ < 3.25 to full phase space. 
h Only in KNO variables; no systematic error assessment. 

The UA1 (Underground Area 1) experiment measured the multiplicity 
distribution for NSD events in the interval \rj\ < 2.5 at ^/s = 200, 500, and 900 GeV [77]. 
188 000 events were used, out of which 34% were recorded at the highest energy. The 
SppS was operated in a pulsed mode where data were taken during the energy ramp 
from 200 GeV to 900 GeV and vice versa. Therefore the data at 500 GeV are in fact 
taken in an energy range from 440 GeV to 560 GeV. Only tracks with a pt larger than 
150 MeV/c are considered for the analysis to reduce the contamination from secondaries. 
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Although not explicitly mentioned in the publication, it is assumed for this review 
that the low-momentum cut-off correction is part of the acceptance correction. UAl 
quotes an overall systematic error of 15%: contributions are from strange-particle decays, 
photon conversions and secondary interactions (3%), as well as the uncertainty in the 
acceptance (4%). Other contributions arise from the selection criteria and uncertainties 
in the luminosity measurement (10%). The luminosity measurement uncertainty only 
applies to the cross section measurement, not to the normalized distribution. The 
uncertainty due to the selection criteria is not quoted. Therefore, assuming that the 
systematic uncertainties were summed in quadrature, this uncertainty is 10% and the 
overall systematic error without the uncertainty on the luminosity is 11% which is the 
value applicable to the normalized multiplicity distribution. 

UAl measured the diV c h/d?7 distribution at y/s = 540 GeV [78]. The analysis used 
8 000 events that have been taken without magnetic field which reduced the amount 
of particles lost at low-momenta to about 1%. The systematic error of the applied 
corrections is estimated by the authors to be 5% without elaborating on the different 
contributions. 

The UA5 (Underground Area 5) experiment was running at the ISR and 

the SppS. A comparison of data taken in p + p and p + p collisions at y/s = 53 GeV 
was made [79] • 3 600 p + p events and 4 000 p + p events were used. Trigger and vertex- 
finding efficiencies as well as acceptance effects have been evaluated with a Monte Carlo 
simulation tuned to reproduce ISR data. For both collision systems the comparison of 
the dA c h/d?7 distribution was done using the uncorrected data and limited to events with 
at least two tracks. In this way the authors attempted to achieve lower systematic errors 
on the result. A ratio of 1.015±0.012 (p+p over p+p) has been found. Furthermore, the 
multiplicity distributions were compared. The authors conclude that the distributions 
agree within errors and that differences between p + p and p + p collisions are smaller 
than 2%. 

UA5 measured the dN ch /dr] distribution at \fs = 200 and 900 GeV for NSD events 
PH EI]. 2100 (3 500) events have been used for the analysis at 200 (900) GeV. It should 
be noted that the corrections are based on a Monte Carlo simulation that has been 
tuned to reproduce data measured at \fs = 546 GeV. The results of the simulation were 
parameterized and scaled to \fs = 200 and 900 GeV in order to estimate the corrections 
for acceptance and contamination by secondaries. The authors only mention statistical 
errors explicitly. 

Measurements of the multiplicity distribution have been presented in [821 [351 
l36l IT5| l8~T] . The distribution is measured in different 7/-regions (smallest: \r]\ < 0.2 
for 540 GeV and \r)\ < 0.5 for 200 and 900 GeV) up to \r)\ < 5.0. Furthermore, the 
result is presented extrapolated to full phase space. The analysis used 4 000 events 
for 200 GeV and 7 000 events each for 540 and 900 GeV. In all cases the unfolding 
of the measured spectrum was performed by minimizing a x 2_ f unc tion. For the case 
of \fs = 540 GeV [35J it was required that the resulting function is a NBD which is 
regarded as a strong constraint. This has to be taken into account when interpreting 
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the result at 540 GeV. The distributions at 200 and 900 GeV were unfolded using the 
maximum-entropy method [15] which is considered to be a less restrictive assumption. 
The assessment of the systematic errors is not very comprehensive and an uncertainty 
of about 2% is quoted. 

A Forward Silicon Micro- Vertex detector that was tested in the context of 
a proposed hadronic B-physics experiment (P238) measured the dN^/drj distribution 
at forward rapidity at ^/s = 630 GeV [83J. A sample of 5 million events is corrected for 
tracks from secondaries (2%) and SD events (0.5%). Acceptance and resolution effects 
are corrected using Monte Carlo simulations tuned to UA5 data. Their magnitude as 
well as the magnitude of the trigger- and vertex-efficiency correction are not detailed. A 
normalization error of 5% dominates the systematic error. It is attributed to inconsistent 
results when only the x or y tracking information is used compared to the case where 
both of them are used. Other effects such as detector efficiency misalignment, and the 
SD cross section are considered by the authors not to significantly contribute to the 
systematic uncertainty. 

The CDF (Collider Detector at Fermilab) experiment [88], a detector at 
the Tevatron collider, measured the dN^/drj distribution at = 630 GeV and 1.8 TeV 
with their so-called Vertex Time-Projection Chambers (VTPCs) [83]. These VTPCs 
have been replaced after years of operation by a silicon detector. The authors do not 
mention whether the corrections correspond to NSD or inelastic events. However, the 
trigger configuration requires a hit on both sides. This points to the fact that the trigger 
is insensitive to the majority of SD events. Furthermore, the authors compare their 
measurement to NSD data from UA5 which confirms that CDF obtained their result in 
NSD events. 2 800 (21000) events have been used for the analysis at 630 (1800) GeV. 
Only events with at least 4 tracks are considered to reduce beam-gas background. The 
authors state that they "do not correct for events missed by the trigger or selection 
procedure" and estimated that the selection procedure misses (13 ± 6)% of the events. 
This is surprising because the normalization for dN c ^/dr] would be significantly wrong 
if this correction was not applied. This is not the case shown in the comparison to 
UA5 data. Tracks with p? < 50MeV/c are not found due to the magnetic field and a 
correction of (3 ± 2)% is applied to account for this loss. A systematic error assessment 
is made; the error is dominated by uncertainties in the tracking efficiency and ranges 
from 3% (at 77 = 0) to 15% (at \r}\ = 3.25). 

CDF measured the multiplicity distribution in various ^-intervals for NSD events 
at y/s = 1.8 TeV [16] . The publication does not mention the number of events used in 
the analysis. A systematic-error assessment is reported to be ongoing, but has not yet 
been published. It is unclear if an unfolding method was used. 

A further multiplicity distribution measurement based on a large event sample is 
in preparation by CDF [89]. This study considers only tracks with a pr larger than 



The E735 experiment [85] at the Tevatron collider measured the multiplicity 

distribution of NSD events at energies of y/s = 0.3, 0.5, 1.0, and 1.8 TeV |53j. 




0.4GeV/c. 
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The extrapolation to full phase space has been done by the authors based on 
Pythia simulations. They provide no further information about the statistics used, the 
corrections, and in particular the question as to whether an unfolding procedure was 
used. This has to be taken into account when the result is interpreted. 

In |85j multiplicity distributions of NSD events are presented in intervals of 
\r)\ < 1.62 and \r)\ < 3.25 as well as extrapolated to full phase space for the four 
aforementioned energies. A total number of 25 million events is mentioned, however only 
a subset is used for the multiplicity analysis whose size is not mentioned. The results are 
only presented in KNO variables. The data were unfolded using the maximum-entropy 
method. A systematic error assessment has not been performed. 

Reference [86] presents multiplicity distributions in \rj\ < 1.57, \r]\ < 3.25, and 
extrapolated to full phase space at y/s = 1.8 TeV of NSD events. About 2.8 million 
events have been used and unfolded with an iterative method similar to the mentioned 
Bayesian unfolding. Systematic uncertainties have been evaluated concentrating on the 
effect of the cuts to reduce contamination by single-diffractive and beam-gas events. In 
|85] and [86] a correction for strange-particle decays is not explicitly mentioned, but it 
can be assumed to have been part of the Monte Carlo simulation used to obtain the 
correction factors. 

3.3. Multiplicity Distributions from y/s = 20 to 1800 GeV 

In the following sections the theoretical and phenomenological concepts introduced in 
the first part of the review are applied to selected multiplicity distributions. An example 
for KNO scaling as well as the fit with a single NBD and a combination of two NBDs is 
shown. The available distributions in full phase space are shown together in multiplicity 
and KNO variables to assess the validity of KNO scaling, which is further discussed in 
Section 13.61 

Figure |5] shows multiplicity distributions in full phase space for NSD events taken 
at the ISR. The distribution is shown in multiplicity and KNO variables. The latter 
indicates that KNO scaling is fulfilled at ISR energies (the moments of these distributions 
are analyzed further below). 

Multiplicity distributions are described by NBDs up to y/s = 540 GeV in full 
phase space as well as in different 77-ranges. This behaviour does not continue for 
y/s = 900 GeV. Figure |6] shows multiplicity distributions together with NBD fits in 
increasing pseudorapidity ranges at 900 GeV (top left panel). The respective normalized 
residuals are also shown (top right panel). The NBD fit works very well for the interval 
1 77 1 < 0.5, but it becomes more and more obvious with increasing grange that the region 
around the most probable multiplicity is not reproduced. The structure found around 
the peak gave rise to the two-component approach, discussed previously, in which the 
data are fitted with a combination of two NBDs. The bottom left panel of Figure [6] 
shows these fits with Eq. (1281) . and normalized residuals (bottom right panel) to the 
same data which yields good fit results for all pseudorapidity ranges. 
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Figure 5. KNO scaling at ISR energies. The figure shows normalized multiplicity 
distributions for NSD events in full phase space vs. multiplicity (left panel) and using 
KNO variables (right panel). The data were measured by the SFM [TP] . 



To assess the validity of KNO scaling all available multiplicity distributions are 
drawn as function of the KNO variable z = ^-^/(N^). This is shown in Figure for 
NSD events in full phase space from 30 to 1800 GeV. Although it is evident that the 
high- multiplicity tail does not agree between the lowest and highest energy dataset, no 
detailed conclusion is possible for the data in the intermediate energy region. Further 
conclusions are derived from the study of the moments of these distributions, see 
Section 13.61 

3.4. dN ch /di] and (N ch ) vs. y/s 

Figure |8] shows dN^/dr) at energies ranging over about two orders of magnitudes, from 
the ISR (y/s = 23.6 GeV) to the Tevatron (CDF data, = 1.8 TeV). Increasing the 
energy results in an increase in multiplicity. The multiplicity of the central plateau 
increases together with the variance of the distribution. Note that the data points at 
the lowest energy are for inelastic events, the other data points refer to NSD events. 
We recall that the dip around rj rs is due to the transformation from rapidity y to 
pseudorapidity 77. 

The left panel of Figure [9] shows dA / " ch /d^| r)=0 as a function of \/s. Closed symbols 
are data for inelastic events; open symbols for NSD events. dV c h/d?7| r)= o increases with 
increasing \fs violating Feynman scaling. Two fits are shown for the NSD data: a 
fit with a + 61ns (a = —0.308,6 = 0.276, solid black line) and a + 6 In s + cln 2 s 
(a = 1.347,6 = -0.0021, c = 0.0013, dashed red line). Due to the fact that different 
published values include different errors, e.g., no systematic errors for the UA5 data, 
the errors are not used for the fit. The Ins dependence was used to describe the data 




Figure 6. Normalized multiplicity distributions of NSD events at yfs = 900 GeV 
in various rapidity intervals are shown fitted with single NBDs (top left panel) 
or a combination of two NBDs (bottom left panel). The two contributing NBDs 
(dashed lines) are shown exemplarily for \rj\ < 3.0 and 5.0. The right panels show the 
normalized residuals with respect to the corresponding fits defined by (l/e)(P(7V c h) — 
fit) with e being the error on P(N c \ l ). These are smoothed over four data points to 
reduce fluctuations. The data were measured by UA5 [T5] . 



at centre-of-mass energies below 1 TeV. Data at a higher energy from CDF showed 
deviations from this fit [El]. The additional In 2 s term yields a better result (x 2 / n df 
reduced by about a factor of two, although the x 2 definition is not valid without using 
the errors); this fit suggests that diV c h/d?7| r)= o increases faster than Ins. The fits are 
extrapolated up to the nominal LHC energy of \fs = 14 TeV. 

The right panel of Figure shows the average multiplicity (N^) as a function of 
yfs. Data are shown for full phase space and for a limited rapidity range of |?7| < 1.5. 
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Figure 7. Multiplicity distributions of NSD events in full phase space in multiplicity 
variables (left panel) and in KNO variables (right panel) . Data points from [TQl [15j HH 
[53]. 
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Figure 8. d7V ch /dr/ at different y/s. Data points from [151 [751 |55j H5 1 [81 1 1 



In publications two different approaches are found to obtain average values in a limited 
77-range. The first uses a normalization to all events having at least one track in 
the considered phase space. The second approach uses a normalization to the total 
considered cross section (inelastic or NSD) including events without any particle in 
the considered range (data shown here). While the latter is the more evident physical 
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Figure 9. dA^ch/d^j^^o (left panel) and (N c h) vs. y/s in full phase space and \r/\ < 1.5 
(right panel) as a function of y/s. Data points from [12l [90l H31 LZU [82l UHl ESI [El EH 

isnTanaisi. 



observable, the former does not depend on the efficiency to measure the total cross 
section which renders it less dependent on model assumptions used in the evaluation of 
the trigger efficiency. Data from bubble chambers at low y/s are included in Figure El 
from the Mirabelle chamber at Serpukhov, Russia [12] and from several bubble chambers 
at FNAL [90]. 

Both sets of NSD data in full phase space are fitted with four different functional 
forms. For full phase space the logarithmic dependence does not reproduce the data and 
is only shown to demonstrate the violation of Feynman scaling; the form a +6 In s + cln 2 s 
fits the data well (a = 16.65,6 = — 3.147, c = 0.334). The form a + bs 1 ^ inspired 
by the Fermi-Landau model [9T1 192] provides a reasonable fit with a = 5.774 and 
b = 0.948. However, since the parameter a in this form is related to contributions 
to the multiplicity from the leading particles it is not expected to be much larger than 
two. Hence, one can conclude that the Fermi-Landau form (N c ^) ~ s 1//4 fails to describe 
the p + p data. The form a + bs n ([93]) provides a good description of the data with 
a = 0, b = 3.102, n = 0.178. At y/s = 14 TeV the different fits differ significantly so that 
measurements at the LHC will easily reject inadequate parameterizations. 

3.5. Universality of Multiplicities in p + p(p) and e + e~ 

Charged-particle multiplicities in e + e~ collisions are found to be larger than the 
multiplicity in p + p(p) collisions at the same centre-of-mass energy (see Figure [TO]) . 
The multiplicities in e + e~ and p + p(p) collisions become strikingly similar when the 
p + p{p) points are plotted at half their collision energy [911 ES]- This leads to the 
concept of an effective energy E e g in p + p(p) collisions available for particle production 
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[93J ESI EH EB]- This concept emerged already in the 1970s in the study of high-energy 
cosmic rays [99] . In this picture the remaining energy is associated with the two leading 
baryons which emerge at small angles with respect to the beam direction: 

E C S = y/s — (-Eiead,l + ^lead^! (E c s) = V« — 2 (Evading) • (37) 

It has been speculated that E eS or correspondingly the inelasticity K = E eS /\/s are 
related to the 3-quark structure of the nucleon [100[ 11011 1102] . In this simple picture 
the interaction of one of the three valence quarks in each nucleon would correspond to 
an average inelasticity (K) ~ 1/3. 

Here we estimate the coefficient of inelasticity K of a p+p(p) collisions by comparing 
p + p(p) with e + e~ collisions. Given a parameterization f e e{\fs) of the \/s dependence 
of (iV c h) in e + e" collisions one can fit the p + p{p) data with [100] 

f PP (V~s) = fee(K-^) + n . (38) 

The parameter no corresponds to the contribution from the two leading protons to the 
total multiplicity and is expected to be close to n = 2. 

To parameterize the multiplicity data in e + e~ collisions we use the analytic QCD 
expressions of Eq. (1291) . The strong coupling constant a s was fixed at the Z mass to 
a s (M|) = 0.118 leaving A and Alphd as fit parameters. The second form is from a 
3NLO calculation [UE2] where the normalization and the A parameter in the expression 
for a s were taken as fit parameters. Both forms yield excellent fits of the e + e~ data 
and essentially provide the same extrapolation for \fs > 206 GeV where no data are 
available. 

A fit with Eq. ( 138]) describes the p + p(p) well and yields K = 0.35 ± 0.01 and 
uq = 2.2 ± 0.19. The fraction of the effective energy, the inelasticity, is studied in more 
detail in the right panel of Figure [10J The inelasticity K is determined for each p + p(p) 
point by solving 

f ee {K^fs~p~ v - Am) = (N ch ) pp - n . (39) 

In the simple quark-scattering picture the offset Am takes the contribution of the masses 
of the two participating constituent quarks to the centre-of-mass energy into account. 
Depending on the values for Am and the leading particle multiplicity n different 
inelasticities can be defined. In Figure ITOl the three cases K\ (n = 0, Am = 0), K 2 
(no = 2.2, Am = 0), and K% (uq = 2.2, Am = 2/3m proton ) are shown. The inelasticity 
Ki decreases from ~ 0.55 — 0.6 at ISR energies to 0.4 at y/s = 1.8 TeV. The inelasticities 
K2 and K3 appear to be energy independent at ~ 0.35, in remarkable agreement with 
the expectation of 1/3 in the simple quark-scattering picture. 

The similarity between (A^ c h) in e + e~ and p + p(p) collisions when the effective 
energy is taken into account raises the question as to whether these similarities still 
persist in more differential observables like rapidity distributions. Note that a remarkable 
similarity was observed between dA" c h/d?7 per participating nucleon pair in central 
Au+Au collisions at y / J/v r /v = 200 GeV and in e + e~ collisions at y/s = 200 GeV [94] . 
This suggests that the effective energy in central Au+Au collisions is close to 100% of 
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Figure 10. Left panel: Comparison of charged particle multiplicities in p + p(p) and 
e + e~ collisions (e + e~ data taken from the compilation in [61]). Note that NLO QCD fit 
(solid gray line) and 3NLO QCD fit (dashed line) of the e + e~ data are almost identical 
and lie on top of each other. Right panel: The inelasticity in p + p(p) calculated for 
three different assumptions. The ^fs dependence of the inelasticity assumed in the 
theoretical study |103] is shown for comparison. 



the beam energy, most likely due to the multiple interactions of the nucleons. In the 
left panel of Figure [TT1 dN r ^ / dr/ distributions from p + p(p) collisions are compared with 
rapidity distributions dN^/dyr with respect to the thrust axis from e + e~ collisions. 
Datasets are compared for which y/s^ m (2 3) v /s^T. For the shown cases the dN ch /drj 
distribution in p+p(p) are broader than the dN ch /dyT distributions. This might indicate 
the contribution from beam-particle fragmentation in p+p(p). Note, however, that based 
on the Landau hydrodynamic picture a simple relation between dN^/dr}^^ and 
dNch/dyxlULo^^ was suggested in [102[ 1104] . The width A of the distribution defined 
as A = (iV c h)/dA^ c h/d?7|^ = o and A = (N(±) /dN c h/dyT\ yT =o, respectively, is shown in the 
right panel of Figure [TTJ Based on the QCD calculation in |105] A is expected to scale 
linearly with y/his. As shown in Figure fTTl this form does not describe the p + p(p) data 
which are well parameterized with A = a + 6 Ins. The Landau hydrodynamic model also 
predicts a linear vms dependence of A [1061 11071 1108] and hence also fails to describe 
the p + p{p) data. 

It will be interesting to see whether this universality of multiplicities in e + e~ and 
p + p(p) collisions also holds at LHC energies. This universality appears to be valid 
at least up to Tevatron energies despite its rather weak theoretical foundation (see 
Section 12. 61) . Under the assumptions that Ki remains constant at about 0.35 also at 
LHC energies and that the extrapolation of the e + e~ data with the 3NLO QCD form 
is still reliable at y/s 5 TeV one can use the fit of p + p(p) data to predict the 
multiplicities at the LHC. This yields (iV ch ) w 70.9 at 7 TeV, (N ch ) w 79.7 at 10 TeV 
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Figure 11. Left panel: Comparison of r\ (p + p(p)) and yr distributions (e + e~) at 
different energies. The variable yx is the rapidity with respect to the thrust axis of 
the e + e~ collision. Right panel: The width A of the r\ distributions (p + p(p)) and yx 
distributions (e + e~) as a function of y/s. Note that the difference between inelastic 
and non-single diffractive collisions is neglected by fitting the combined p + p(p) data 
with A = a + 61n - v /i. In case of the Landau model (iV c h) /(diV c h/dy| y= o) = \/2ttL where 
L = \n{y/s/{2m p )) is shown. Data points for e+e" from [531 [TMl [LTUl [TTT1 [TT21 151 fTPg ] . 



and (N ch ) « 88.9 at 14TeV. Extrapolating the ratio A = (N ch ) /(dN^/dr/)^ with the 
form A = a + 6 In y/s (see Figure fTTl) these multiplicities correspond to dN^J dr/l^o ~ 5.5 
at 7TeV, dN ch /dr]\ v=0 w 5.9 at lOTeV and dN ch /dr]\ v=0 w 6.4 at 14TeV. 

5.5. Moments 

The moments of the multiplicity distributions as defined in Section 12.21 will now be 
used to identify general trends as function of y/s and to study the validity of KNO 
scaling. First the reduced C-moments, Eq. fl6]), are studied. The left panel of Figure fT2l 
shows C 2 to C 5 from y/s = 30 to 1800 GeV. These have been calculated from the 
available multiplicity distributions and are consistent with published values where 
available. However, for the ISR the uncertainties are overestimated due to the fact 
that the uncertainties on the normalized distributions include the uncertainty on the 
cross section. At lower energies data from bubble-chamber experiments show that the 
moments are constant (see e.g. [36] for a compilation). In the right panel a constant is 
fitted to the data points from the ISR. This emphasizes that for y/s larger than at ISR, 
the moments increase significantly with energy. 

However, as argued in Section 12.31 the conclusion about constant C-moments 
follows from KNO scaling only in an approximation. Therefore the behaviour of factorial 
moments is analyzed. Exemplarily F 2 and _F 4 are shown in the left panel of Figure [TBI 
compared to their C-moments counterparts. Also these increase with increasing y/s. 
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Figure 12. The reduced C-moments C2 to C5 are shown at different y/s for 
distributions of NSD events in full phase space. The right panel shows a zoom. The 
lines are constant functions fitted to the low-energy data points from the ISR. The 
data are from QJ2 [HI EH [53] . 
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Figure 13. Left panel: C- and _F-moments at different y/s. The lines are constant 
functions fitted to the low-energy data points from the ISR. Right panel: Influence of 
high-multiplicity bins on C-moments. The moments C4 and C5 are shown once using 
all bins for the calculation and once only bins that contain at least 1% of the topological 
cross section. The lines are constant functions fitted to the low-energy data points from 
the ISR. Data from Q3H HH |33 [S3] . 



Both, C- and F-moments, show an increase with y/s and allow the same conclusion 
about the validity of KNO scaling. 

It is important to note the influence of the tail of the distribution, i.e., of bins 
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Figure 14. Ratio of average multiplicity and dispersion as a function of yfs for p+p(p) 
and e+e" data. Data from [TQl HSJ EH [53] (p + p(p)) and [333 HUH QUI TH [TT8l 
11191 11201 11211 1122) (e + e~). More p + p data points at lower energies are shown in [2]. 



at high multiplicity, on the moments; especially on the higher ones. The right panel 
of Figure [13] compares C4 and C5 calculated from a subset of bins, excluding the ones 
that are below 0.01 (i.e. less than 1% of events occur at this multiplicity) which mainly 
excludes high-multiplicity bins, with the values calculated with all bins. The value of 
0.01 is approximately the smallest bin content in the data from ISR. The difference 
is significant which shows that the moments will change if more events are collected 
at a given energy. Nevertheless, we see the increase of the moments with y/s although 
less pronounced. One may ask of course why the moment calculated with all bins and 
the moment calculated from the subset do not agree within uncertainties. This is due 
to the fact that for all bins without entries an uncertainty of is assumed which is 
incorrect. Assuming a Poisson distribution in each bin (with an unknown mean), a 
bin with no measured entries has an upper limit of 2.3 at 90% confidence level (see 
e.g. |114j ). However, following this strictly would mean to assign this error for all bins 
without entries up to infinity. Consequently, also the uncertainty on the moments goes 
to infinity. 

In Figure HUtlie ratio of the average multiplicity (n) and the dispersion D is shown. 
It is constant when KNO scaling holds [30]. Results for e + e~ are shown in addition to 
the p + p(p) data. For p + p(p) the ratio is clearly not constant, while it is approximately 
constant for e + e~ albeit with significantly larger errors. At the same y/s the multiplicity 
distribution in p + p(p) is significantly broader than in e + er . 

In summary, the C- and F-moments increase with y/s, even considering the 
influence of high-multiplicity bins. Furthermore, (n)/D is not constant. These facts 
clearly demonstrate that KNO scaling is broken. 
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Figure 15. Multiplicity distribution at y/s — 30.4 GeV measured at the ISR [10] and 
y/s = 900 GeV measured by UA5 [15] fitted with a single NBD (both panels) and a 
combination of two NBDs (right panel). 



CDF has addressed the question as to whether the violation of KNO scaling is 
related to a special class of events |123j . They use events at 1.8 TeV and only tracks 
with a pt above 0.4GeV/c. Here, a weak KNO scaling violation is reported in \rj\ < 1.0. 
Furthermore, when they divide their data sample into two parts, they can confirm KNO 
scaling for the soft part of their events and at the same time rule it out for the hard 
part. In |123] soft events are defined as events without clusters of tracks with a total 
transverse energy above 1.1 GeV, regarded as jets. 

Two further interesting features are observed together with the onset of KNO 
scaling violations [77]: the average transverse momentum that was about 360MeV/c 
at ISR energies starts to increase. Furthermore, a y/s dependent correlation between 
the average-pr and the multiplicity is measured. Both observations point to the fact 
that the influence of hard scattering becomes important at these energies. 

As mentioned earlier, higher-order QCD calculations predict oscillations of the H- 
moments as function of the rank. The uncertainties of moments increase with the rank 
(see e.g. Figure [12] for C-moments); this fact applies also to the if- moments. The search 
for oscillations requires the calculation of moments up to ranks of 10 - 20. The data 
studied in this review allow to calculate these moments only with large uncertainties. 
There are indications of oscillations but definite conclusions require a deeper study and 
distributions with large statistics that can hopefully be obtained at the LHC. 

3. 7. NBD Parameters (N ch ) and k~ l 

Fitting the multiplicity distribution with a single NBD is satisfactory up to about 
540 GeV; at 900 GeV deviations become clearly visible. Distributions at larger y/s can 
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Figure 16. Left panel: Parameters of a single NBD fit and corresponding /ndf. 

and x 2 /ndf are scaled for visibility. Right panel: Parameters of a single NBD fit 
compared between p + p (fits performed here) and e+e - (from [1 151 I117| fT7]). The 
area between the dashed lines corresponds to the predictions for 1/k from [63] . 



be successfully fitted with a combination of two NBDs. 

Figure [15] shows exemplarily multiplicity distributions from ISR and UA5 fitted 
with a NBD. While in the former the NBD reproduces the shape very well, in the latter 
structures (especially around the peak) are visible that are not reproduced by the fit. 
Interestingly the x 2 /ndf of the fit at y/s = 900 GeV is still good (see the left panel of 
Figure ITS]) . 

In Figure [16] (left panel) the obtained fit parameters (n) and k~ x are shown for 
datasets in full phase space at different yfs as well as the x 2 /ndf of the fits. The shown 
X 2 /ndf for the data from the ISR is underestimated because as previously mentioned 
the uncertainties on the normalized distributions include the uncertainty on the cross 
section. This uncertainty has two components, one applicable to the measurement at 
a given yfs and one global scale uncertainty |124j . Adding these linearly and removing 
them from the uncertainty of the normalized distribution leads to an increase of the 
X 2 /ndf of about 25%. The average multiplicity (n) increases linearly with In^/s like 
it was already discussed in Section 13.41 increases with y/s and can be fitted with 
a function of the form a + feln-^/i. KNO scaling corresponds to a constant, energy- 
independent k. Figure [16] (right panel) compares k~ l from p + p and e + e~ data. Both 
can be fitted with the same functional form, but the values for e + e~ are generally lower, 
indicating a narrower distribution. An extensive compilation of k~ x in p + p and e + e~ 
collisions can be found in |6[ Figure 2.5]. For e + e~, this compilation includes k" 1 at 
lower energies. It is argued in [6] that for LEP energies k~ l tends to flatten, i.e., that 
the KNO scaling regime is reached. A discussion about the parameters of NBDs fitted 
to p + p data can also be found in [125] . 



Charged- Particle Multiplicity in Proton-Proton Collisions 



35 



3.8. Two NBD Fits 

Deviations between the multiplicity distribution and the fit with a single NBD are 
found at highest SppS energies. The combination of two NBDs (Eq. ( 1281) ) yields better 
agreement with the data. Both fit attempts are shown in the right panel of Figure [15] 
for y/s = 900 GeV. Fits with two NBDs can be performed unconstrained or following 
an approach that constrains the parameters as, e.g., suggested in [15] . 

In [45] first the average multiplicity of the soft component {n) so { t using only data 
below y/s = 60 GeV and the total average multiplicity (n)totai using available data up 
to y/s of 900 GeV are fitted. A logarithmic dependence is assumed for (n) so f t , while 
additionally for (n) to tai a ln 2 -term is added. 

Following the assumption based on a minijet-analysis by UAl [45] that the semi- 
hard component has about twice the average multiplicity than the soft component, a 
can be calculated from (n) so f t and (n) total- Two variants are considered, variant A in 
which (n) S e m i_h a rd = 2(n) soft , and variant B with (n) semi _ hard = 2(n) soft + 0.1 In 2 y/s. 

The parameter fc so f t is found to be rather constant between 200 and 900 GeV and 
thus set to k so{t = 7. Three scenarios are then presented in [15] for the extrapolation to 
higher energies. The first assumes that KNO scaling is valid above 900 GeV (fc S emi-hard ~ 
13). Scenario 2 fits /c to tai with: 

k£L = a + b\nyfi. (40) 

Scenario 3 fits a next-to-leading order QCD prediction to fc sem i-iiard : 

^semi-hard ~ « ~ \/WHV^/Qo) ■ (41) 

Note that in the original publication [45, Eq. (12)] Eq. ( 14 ip is incorrectly printed but 
used correctly in the calculations and figures. The correct formula can be found in [5]. 
The free parameters a, b, Qq are then found by fitting the data. These three scenarios 
(1-3) can be combined with the aforementioned variants A and B, resulting in a total 
of six possibilities. Here we restrict ourselves to only three of them (1-3 combined with 
A). 

Figure [TT] shows the functional forms found in [45]. These are compared to the 
unconstrained results obtained from fitting the distributions with Eq. (1281) . Note that 
only the data from y/s = 200 GeV to 900 GeV were used to fit the functional forms 
in [45]. There are clear differences, e.g., at 200 GeV for (n) semi _h a rd and at 540 GeV for 
^semi-hard- O ne a ^ so observes large errors for certain fits showing that several solutions 
with similarly small ^/ndf exist. The ^/ndf is, as expected, generally better for 
unconstrained fitting. In several cases the x 2 /ndf is significantly lower than 1 which is 
unexpected and might be attributed to the requirement of smoothness in the unfolding 
procedure. 

At 1.8 TeV the fraction of soft events is much larger in the unconstrained fit; also the 
other fit parameters do not follow the extrapolations. Consequently, for the constrained 
fit, the x 2 fndf is very large at this energy, the fit is not very good. Figure [TBI shows the 
multiplicity distribution in full phase space at y/s = 1.8 TeV compared to the predictions 
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Figure 17. The functional forms (lines) found in [45] are compared to unconstrained 
fits of all five parameters (points). In addition the \ 2 /ndf is shown. The data at largest 
y/s are from E735 [53] ; the others are from UA5 [HI HI] . A and B in the legends refer 
to variants A and B in 01] (see text). 




Figure 18. Comparison between the predictions of the two-component model [45] with 
the E735 measurement in full phase space at \/s — 1.8 TeV [53]. The right panel shows 
normalized residuals between data and the predictions. 



of this model. Only scenario 3 follows the general trend of the distribution. However, 
none of the curves reproduces the distribution in detail. 

Figure [19] shows a comparison of predictions of this model (using values from the 
authors derived for limited phase space in [126J ) with data from CDF in \r)\ < 1 at 
\/s = 1.8 TeV. Scenario 1 and 3 reproduce the spectrum reasonably well. 
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Figure 19. Comparison between the predictions of the two-component model [126: 
with the CDF measurement in \r/\ < 1 at y/s = 1.8 TeV [15] . The right panel shows 
the ratio between data and the predictions. 



We conclude that unconstrained fits with two NBDs work successfully with a 
reasonable low x 2 / n df for all distributions considered here. However, general trends 
as a function of \/s cannot readily be identified. As an alternative in [45J parameters are 
fixed following certain assumptions resulting in more systematic fit results. However, 
the results are in some cases significantly different from the parameters obtained by the 
unconstrained fits. 

3. 9. Open Experimental Issues 

This section addresses open experimental issues and presents some comparison plots 
between data of disagreeing experiments. 

A direct comparison between UAl and UA5 at y/s = 540 GeV in limited 77-regions 
and in KNO variables shows that the two experiments agree in their confirmation of 
KNO scaling in the interval \rj\ < 0.5. However, they disagree in the interval \rj\ < 1.5, 
but the violation of KNO scaling in the UA5 data is only due to an excess of events 
with z > 3.5, i.e., events that have more then 3.5 times the average multiplicity. This 
comparison has been performed in [77] and is shown for \rj\ < 1.5 in Figure [20] It also 
includes E735 data in \rj\ < 1.62 which shows better agreement with the UA5 data 
in the tail. The band corresponds to the region where data points are taken from the 
original figure which is of poor quality [SHI Figure 2}. It corresponds to data points 
from y/s = 300 GeV to 1.8 TeV. The band most likely overestimates the error bars of 
the single points. Nevertheless, E735 confirmed KNO scaling in \r]\ < 1.62 based on 
their data; this was done only by comparing the distributions in KNO variables and 
not by studying the moments [85J . A final conclusion about the slight KNO violation in 
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Figure 20. Multiplicity distribution of NSD events measured by UA1 and UA5 at 
y/s = 540 GeV in \rj\ < 1.5 shown in KNO variables [781135]. Furthermore, data from 
E735 at s/s = 1.8 TeV in \r]\ < 1.62 is shown [85]. See the text for an explanation of 
the E735 band. 



\t)\ < 1.5 cannot be made at present. 

Reference [53] compares multiplicity distributions in full phase space from E735 and 
UA5 at three different energies (see the top left panel of Figure [21]). The distributions 
disagree especially in their tails. This inconsistency has been frequently quoted [55[ E]. 
However, it is important to note that the data from E735 are extrapolated from \rj\ < 3.25 
to full phase space which may imply a significant systematic uncertainty. Also the data 
from UA5 are extrapolated, in this case starting from \rj\ < 5. This is less of a problem 
since an estimation based on Pythia at = 900 GeV shows that 86% (64%) of the 
particles are emitted in \rj\ < 5 (3.25). A direct comparison of data from E735 and UA5 
in an ^-region where both detectors are sensitive is therefore very interesting. The top 
right and bottom left panel of Figure EE] show such a comparison for similar ^-regions: 
< 1.5 (1.62) and \rj\ < 3.0 (3.25) for UA5 (E735). The bottom right panel of FigureEH 
shows the comparison in full phase space. Due to the fact that the E735 data are only 
available in KNO variables, the UA5 data are shown superimposed in KNO variables, 
too. No scaling correction has to be applied due to the different ^-regions because the 
data are shown in KNO variables and thus already scaled with the average multiplicity. 
It can be seen that in both //-intervals the distributions agree within errors (apart from 
very low multiplicities in the smallest 77-region). The discrepancy appears going to full 
phase space. Hence, a systematic effect in the extrapolation procedure may be suspected 
as the cause of the discrepancy. 

Furthermore, it should be noted that results from CDF and E735 deviate from each 
other in similar phase space regions, see Figure [221 Studying the multiplicity distribution 
in KNO variables shows that CDF results at 1.8 TeV are closer to the UA5 results at 
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Figure 21. Top left panel: Comparison of UA5 and E735 data in full phase space 
at three different energies. Data from [T|>J HH [S3] . Other panels: Comparison between 
UA5 [TS] at y/s — 900 GeV and E735 [55] in approximately equivalent ^-regions (top 
right and bottom left panels) and in full phase space (bottom right panel). 



900 GeV than to the E735 results at 1.8 TeV (plot not shown). 

In summary, there are various experimental inconsistencies, especially in the tail of 
the distributions which has a significant influence on, e.g., the calculation of moments of 
higher rank. It will be interesting to compare data taken at the LHC with the existing 
distributions. 

4. Predictions 

The measurement of the charged-particle multiplicity at the LHC has the potential 
to improve our understanding of multi-particle productions mechanisms by rejecting 
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Figure 22. Data from CDF [16] and E735 [86] at y/s = 1.8 TeV are compared. The 
right panel shows normalized residuals using the error of the E735 data (these have 
been smoothed over 5 data points to reduce fluctuations): the black curve compares 
the experiments directly; the dashed red curve takes the different 77-ranges into account 
by simply scaling the CDF N c h axis by a factor (1.57/1.5). 

models based on incorrect assumptions. In Figure [231 predictions for dN c ^/di]\ v=0 and 
(iVch) in full phase space are summarized. Predictions for diV^/d^l^o range between 
about 4.3 — 6.4 at y/s = 7TeV and 4.5 — 7.8 at y/s = 14 TeV. For the charged multiplicity 
in full phase space the range is about 55 — 75 at y/s = 7 TeV and 65 — 90 at y/s = 14 TeV. 
Measured values outside these ranges would come as a surprise. 

The predictions can be classified into several classes. First there are the simple 
extrapolations of trends observed at lower y/s (CDF [84], Busza |127[ 1128] ). Those 
logarithmic extrapolations from Section [331 that fit the data reasonably well are shown. 
The In 2 y/s extrapolation performed in this review is conceptually identical to the 
extrapolation performed by CDF; however, additional data points from UAl and P238 
have been used in this review which leads to about a 5% lower extrapolated value 
at larger energies than found by CDF. The predictions based on the p + p/e + e~ 
universality discussed in Section 13.51 also belong to this class. Other model predictions 
are based on the assumption of gluon saturation (Armesto, Salgado, Wiedemann |129] 
and Kharzeev, Levin, Nardi |130j ). QGSM is a representative of a class of models for soft 
scattering based on Regge theory and the parton structure of hadrons [131] . In these 
models proton-proton interactions are described in terms of the exchange of colour- 
neutral objects called Pomerons. The multiple-particle production is governed by the 
fragmentation of strings that occur in the cut Feynman diagrams of these processes. 

In many cases it is more practical to implement theoretical ideas in terms of Monte 
Carlo event generators. Phojet [132] is such a generator based on the Dual Parton 
Model [56] whose concepts are similar to the concepts used in QGSM. Based on the 



Charged- Particle Multiplicity in Proton-Proton Collisions 



41 



16 


- n7TeV o-ioTeV • 14TeV 




12 


— n7TeV o10TeV • 14 TeV 




15 
14 


-Pythia, D6T tune (CMS) no • 
-Pythia, ATLAS MC09 tune □ o 


• 


11 


— Pythia, D6T tune (CMS) □ o • 




13 


—Pythia, tune A no* 




10 


— Pythia, ATLAS MC09 tune □ o • 




12 


-Pythia default 


□ o • 


9 


— Pythia, tune A □ o • 




11 
10 


—Epos, with mini-plasma 
—Epos, no mini-plasma 


• 

• 


8 


— Pythia default do* 




9 


— Phojet no* 




7 


— Abramovsky, Radchenko — •— 




8 


-Kaidalov, Poghosyan (QGSM) do* 




6 


— Phojet □ o • 




7 
6 


— Kharzeev, Levin, Nardi • 
— Armesto, Salgado, Wiedemann 


□ o • 


5 


— Kaidalov, Poghosyan (QGSM) □ o • 




5 


-G-O., R. (In 2 \fs extrapolation) □ o • 




4 


— Kharzeev, Levin, Nardi • 




4 


-G-O., R. (In\fs extrapolation) no* 




3 


— G-O., R. (Ir^xJs extrapolation) no* 




3 
2 


-G-O., R. (eV/pp universality) n 
-Busza — •— 


• 


2 


- G-O., R. (e*e"/pp universality) □ o • 




1 


— CDFparam. , , □ o 

i i i r i i i i i i i i i i i i i i i i i i 


I I I I I I I I I I I 


1 


- Busza • ■ 

1 1 1 1 1 1 1 1 i j 1 1 1 1 1 1 1 j 1 1 1 1 1 1 1 1 1 1 1 1 1 1 j 1 1 1 1 1 1 1 1 




1 2 3 4 5 


6 7 8 
dN rh /dnl 


10 20 30 40 50 60 70 80 90 100 
<N ch > full phase space 



Figure 23. Predictions for dNch/drjl^o and (N c h) in full phase space in p+p collisions 
at V« = 7, 10, and 14TeV. For dN ch /di]\ n=0 predictions 6 and 7 [IH [130] it is 
not explicitly stated whether the predictions are for inelastic, NSD, or non-diffractive 
collisions; all other predictions (EH H27j H2H1 CLEO 0121 UM LEU EL] are for NSD events. 



Pomeron picture Phojet accounts both for soft and hard interactions. Epos is another 
event generator that aims at consistently treating soft and hard interactions [133[ 1127] . 
This model has been compared and tested with data from high-energy cosmic rays. 
Epos can be run in a mode which allows the formation of a quark-gluon plasma in 
p + p collisions. In the Pythia event generator [51] the picture of individual parton- 
parton scatterings, which successfully describes high-p^ phenomena, is extrapolated to 
low pt- Pythia has many parameters and several Pythia tunes exist which, e.g., describe 
Tevatron data well. The shown predictions are based on the default Pythia settings and 
three frequently-used tunes: A |134] . D6T |134] . and ATLAS MC09 which are the main 
tunes used by the CDF, CMS, and ATLAS experiments, respectively. Pythia 6.4.14 has 
been used with the structure functions CTEQ6L |135j . The large increase in ^Js from 
the Tevatron to the LHC will unveil whether certain Pythia tunes really capture the 
underlying physics or whether they are just ad hoc descriptions at specific energies. 

Predictions for the multiplicity distribution in full phase space and in a limited 
range of 1 77 1 < 1 are shown in Figure [241 for y/s = 7TeV and in Figures I2"5l and I2T51 for 
14 TeV. Shown are the extrapolations of the two-component model with NBDs for the 
three scenarios (from [4"5 |I126] ). Pythia with the four aforementioned tunes, and Phojet. 
Furthermore, for 14 TeV, a prediction from QGSM |131] and a prediction found in the 
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Figure 24. Predictions for the multiplicity distribution of NSD events at 7TeV are 
shown in full phase space (left panel) and < 1 (right panel). The 2NBD predictions 
are from [45 lll26j . for details see text. 




Figure 25. Predictions for the multiplicity distribution of NSD events at 14TeV are 
shown in full phase space. In addition to the logarithmic view, the right panel shows 
a linear scale and a zoom into the low-multiplicity region. The predictions are from 
[451 HHl [1311 [551 H36], for details see text. 



framework of a multiple-parton interpretation of the collision [55] (only for full phase 
space, see Section 12. 5. 2j) are shown. 

The predictions differ significantly, which is most pronounced in the tails of the 
distributions where the deviation is more than an order of magnitude. This applies in full 
phase space as well as in limited 77-ranges. However, also in the low-multiplicity region 
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Figure 26. Same as Figure [25] but in \t)\ < 1. 

there are clear differences (see the right panels of Figure [25] and 126]) . This difference 
is less pronounced in \r)\ < 1. From \/s = 7TeV to 14TeV the predicted differences 
increase further. 

Multiplicity distribution measurements at larger y/s will allow one to decide which 
models best describe the data. For the specific case of Pythia it is clear that the 
parameter space is very large and several combinations of parameters may describe 
the data equally well. Nevertheless the measurement of the multiplicity distribution 
(together with the p? spectrum and the correlation of (p?) an d the multiplicity) will 
allow, e.g., one to learn about the colour correlations in the final-state [137J. 

5. Summary 

This review summarizes measurements of charged-particle multiplicity distributions and 
pseudorapidity densities in high-energy p + p(p) collisions. Moreover, related theoretical 
concepts have been briefly presented. The multiplicity cannot easily be described within 
QCD because it is related to soft interactions for which the strong coupling constant 
is large and perturbative methods are difficult to apply. The validity of the available 
theoretical descriptions has been assessed using data from collider experiments at centre- 
of-mass energies over about two orders of magnitude, from 23.6 GeV to 1.8 TeV. The 
energy dependence of (N&) and dN c h/ di]\ v=Q shows that Feynman scaling is not satisfied 
at currently available energies; the moment analysis shows that KNO scaling does not 
hold except for very central and small regions of phase space where it is not ruled out 
by the present data. For high energies a single NBD does not fit the data anymore; 
a combination of two NBDs is more successful; however, additional assumptions are 
needed to identify general trends as a function of \/s. Although rapidity and multiplicity 
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distributions differ between p + p(p) and e + e~ collisions, their average multiplicities 
as function of y/s show similar trends that can be unified using the concepts of 
effective energy and inelasticity. Without a correction for the multiplicity related to 
the leading protons the inelasticity K in p + p(p) collisions as defined by comparing to 
the multiplicity in e + e~ collisions decreases from about 0.6 at = 23.6 GeV to about 
0.4 at = 1.8 TeV. Taking this correction into account yields an energy-independent 
inelasticity of K « 0.35. Open experimental issues have been discussed and predictions 
for the LHC energy regime have been enumerated and briefly described. Interestingly, 
models that all more or less describe average multiplicities and multiplicity distributions 
up to Tevatron energies make significantly different predictions for the LHC. 

At the LHC multiplicity measurements together with other global event properties 
will provide input to distinguish between the wealth of different model predictions 
including those from popular Monte Carlo event generators. This will allow the amount 
of possible interpretations of the underlying physics to be reduced. In particular it will 
deepen the understanding of multiple-parton interactions and hadronisation as the LHC 
will allow for the first time to probe p + p collisions in an energy regime where multiple 
hard parton interactions are present in most of the events. Understanding the underlying 
dynamics of multi-particle production is not only an interesting research topic in itself. 
Equally important is the characterization of the underlying event as prerequisite for 
more specialized studies of exotic and rare channels which LHC is aiming at. 



A. Feynman Scaling 



In his paper [20J, Feynman concluded that the mean number of particles rises 
logarithmically, but does not give a mathematic proof. However, one can assess the 
asymptotic behaviour by rewriting Eq. ([3]) in the form of the invariant cross section^]: 

^E-^- = f i {p r ,x F ). (A.l) 
cr dp z d z p F 

fi factorizes approximately (found experimentally) and a normalization of gi is chosen 
such that 

J fi(PT,x F )d 2 p T = fi(x F ) J gi(p T )d 2 p T = fi(x F ). (A.2) 



=i 

Integration of Eq. (lA.lj) and application of Eq. f ]A.2j) yields: 
1 d 3 cr d 3 p f d 3 p 

cr dp z d z p F hi J hi 



fM * P ' f , (A.3) 

% The definition of the Feynman function is different in some publications (e.g. 138 ]), not considering 
the 1/cr term in Eq. (TO]) . This approach, however, results in conclusions that are not confirmed by 
experiment. In detail compared to the results of the calculation presented in the following, the left sides 
of Eqs. (jA.8[) and (|A.9[) have to be multiplied by a. 
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where on the left side the definition of the invariant cross section is used with the average 
particle multiplicity (N), and for tut an effective average-p^ is used. 

Rewriting in xp yields the expression used to prove Feynman's hypothesis: 

(N) = f fi(x F ) r^^ - (A.4) 

Y X F ' W 2 

The integral is symmetric because fi(xp) is symmetric for collisions of identical particles. 
For other collision systems the integration can be performed separately for negative and 
positive xp and yields the same result. fi(xp) < B is finite and bounded due to energy 
conservation. Furthermore, Feynman assumes that for xp = a finite limit is reached. 
Therefore: 

dXF (A.5) 




(A.6) 

2Bln^. (A.7) 

The first term can be shown to be constant for W — > oo and the second is proportional 
to InW. 

In consequence, Feynman scaling implies that the average total multiplicity scales 

as 

(N) oc In W oc In y/s. (A.8) 

Considering that the maximum reachable rapidity in a collisions increases also with 
In y/s, and under the further assumption that the particles are evenly distributed in 
rapidity, it follows that dN/dy is independent of y/s: 

= constant. (A. 9) 

dy 

B. Uncertainties on Moments 

Given a distribution P(n) which is normalized to 1 with an uncertainty e n , and assuming 
that the errors on the individual bins are uncorrelated (which may not be the case after 
an unfolding procedure is applied, see also Section 13.1. 2D their errors can be calculated 
using the partial derivatives: 



dC q n q (n) — (n q )qn 



dP(n) (n)i +1 

dF q n(n — l)...(n — q + — (n(n — l)...(n — q + l))qn 
dP{n) = {n)i+ l ' 



(B.l) 
(B.2) 
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ap(n) 

The total is then 



(B.3) 



<9X x 2 



*? - £ ) • (B.4) 



where X q is C q , F q , or 



C. Relation of NBD and BD 



This section shows that a NBD becomes binomial when k is a negative integer. To start 
with the NBD and the binomial distribution (BD) are recalled. The NBD is: 

pNBDr \ — ( n + k-1 \ ( (n)/k \ n 1 

F ^ n >-[ n ) \l + (n)/k) (T+WW { } 

with n failures and k successes. The BD is 



m 



CW = [ n I -/')"'-" (C.2) 

with m trials, n successes, and success probability p. Important is that for both, the 
NBD and the BD, the running variable is n. 
Using Eq. ( JTTj) . the NBD is rewritten as 

nbd/ \ (n + k — 1) ■ (n + k — 2) ■ ... ■ k 



nl 

-k—n 



{{n)/kY{l + {n)/k)-«-\ (C.3) 
If we identify m with — k and p with —(n)/k we find: 

prsw = ( ""'"" 1> - ( ";,'"" 2> --- ( "'" ) (- P )" (i - P)-" (c.4) 

Assuming that n — m — 1 < and —m < 0, all terms in the product are negative and 
the following relation holds: 

(n — m — 1) • (n — m — 2) • ... • (— m) = 

(-1)™ • (-ra + m + 1) • (-71 + m + 2) • ... • m. (C.5) 

Eq. flOil) is then: 

p NBD (n) = (m-n + l)-(m-n + 2)-...-m pW(1 _ p)m _ n _ (Q g) 

Applying Eq. (JTTj) the first term can be identified as the binomial term of the BD: 

(m — n + 1) ■ (to — n + 2) • ... - m 




nl 



(C.7) 



Thus the NBD with negative integer k is a BD with the Bernoulli probability p = — (n) / k 
and the number of trials m = —k. For such a BD the assumptions made above are indeed 
fulfilled: — m < k < (m > 0) follows trivially; the number of successes n is smaller or 
equal than the number of trials m and therefore also n — m — 1 < 0. It is required that 
< p < 1, thus < (n) < 
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